Approximating a similarity matrix by a latent class model: A reappraisal of additive fuzzy clustering
نویسندگان
چکیده
Let Q be a given n× n square symmetric matrix of nonnegative elements between 0 and 1, e.g. similarities. Fuzzy clustering results in fuzzy assignment of individuals to K clusters. In additive fuzzy clustering, the n × K fuzzy memberships matrix P is found by leastsquares approximation of the off-diagonal elements of Q by inner products of rows of P. By contrast, kernelized fuzzy c-means is not least-squares and requires an additional fuzziness parameter. The aim is to popularize additive fuzzy clustering by interpreting it as a latent class model, whereby the elements of Q are modeled as the probability that two individuals share the same class on the basis of the assignment probability matrix P. Two new algorithms are provided, a brute force genetic algorithm (differential evolution) and an iterative row-wise quadratic programming algorithm of which the latter is the more effective. Simulations showed that (1) the method usually has a unique solution, except in special cases, (2) both algorithms reached this solution from random restarts and (3) the number of clusters can be well estimated by AIC. Additive fuzzy clustering is computationally efficient and combines attractive features of both the vector model and the cluster model. © 2008 Elsevier B.V. All rights reserved.
منابع مشابه
Improving Imbalanced data classification accuracy by using Fuzzy Similarity Measure and subtractive clustering
Classification is an one of the important parts of data mining and knowledge discovery. In most cases, the data that is utilized to used to training the clusters is not well distributed. This inappropriate distribution occurs when one class has a large number of samples but while the number of other class samples is naturally inherently low. In general, the methods of solving this kind of prob...
متن کاملDeveloping Additive Spectral Approach to Fuzzy Clustering
An additive spectral method for fuzzy clustering is presented. The method operates on a clustering model which is an extension of the spectral decomposition of a square matrix. The computation proceeds by extracting clusters one by one, which allows us to draw several stopping rules to the procedure. We experimentally test the performance of our method and show its competitiveness. In spite of ...
متن کاملA computational method to analyze the similarity of biological sequences under uncertainty
In this paper, we propose a new method to analyze the difference and similarity of biological sequences, based on the fuzzy sets theory. Considering the sequence order and some chemical and structural properties, we present a computational method to cluster the biological sequences. By some examples, we show that the new method is relatively easy and we are able to compare the sequences of arbi...
متن کاملخوشهبندی اسناد مبتنی بر آنتولوژی و رویکرد فازی
Data mining, also known as knowledge discovery in database, is the process to discover unknown knowledge from a large amount of data. Text mining is to apply data mining techniques to extract knowledge from unstructured text. Text clustering is one of important techniques of text mining, which is the unsupervised classification of similar documents into different groups. The most important step...
متن کاملThematic Fuzzy Clusters with an Additive Spectral Approach
This paper introduces an additive fuzzy clustering model for similarity data as oriented towards representation and visualization of activities of research organizations in a hierarchical taxonomy of the field. We propose a one-by-one cluster extracting strategy which leads to a version of spectral clustering approach for similarity data. The derived fuzzy clustering method, FADDIS, is experime...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Computational Statistics & Data Analysis
دوره 53 شماره
صفحات -
تاریخ انتشار 2009